Nicholas Carlini

mentions: 17 · type: Person

// recent coverage 17 mentions

12:09

2026-06-29

dev.to

artificial-intelligence

Building Stuff That Doesn't Leak Everyone's Data

Developer Maneshwar is building git-lrc, a free and source-available Micro AI code reviewer that runs on every commit. The project highlights the critical need for data privacy in AI systems, warning …

01:20

2026-06-24

letsdatascience.com

artificial-intelligence

Anthropic Model Discovers Vulnerabilities in US Classified Systems

Anthropic's Mythos AI model identified vulnerabilities in highly sensitive U.S. government computer systems during a testing exercise, with Senator Mark Warner claiming it broke into almost all classi…

14:23

2026-06-18

letsdatascience.com

ai-safety

US Directive Suspends Anthropic's Mythos and Fable Access

The US government issued an export control directive on June 12 suspending access to Anthropic's Mythos 5 and Fable 5 models for all foreign nationals, prompting Anthropic to disable the models for al…

10:25

2026-06-17

runtimewire.com

ai-safety

Anthropic's Mythos fight turns on the hacker who first told it to slow down

Anthropic researcher Nicholas Carlini, who initially warned colleagues in March that the company's next-generation AI model Mythos was too capable to release, has become central to Anthropic's argumen…

08:16

2026-06-17

wsj.com

ai-safety

The Hacker Sent by Anthropic to Calm the Government's Nerves About AI Safety

Anthropic sent hacker Nicholas Carlini to reassure government officials about AI safety concerns, leveraging his expertise in adversarial machine learning to demonstrate the company's commitment to re…

02:06

2026-06-17

cryptobriefing.com

ai-safety

Anthropic sends researcher to address US government AI safety concerns

Anthropic sent security researcher Nicholas Carlini to Washington to address US government concerns about AI safety after export control directives forced the company to temporarily suspend global acc…

02:26

2026-05-30

red.anthropic.com

large-language-models

Measuring LLMs' ability to develop exploits

Anthropic researchers found that its Claude Mythos Preview model can identify complex software vulnerabilities and combine them into complete end-to-end attack chains, marking a significant advancemen…

00:00

2025-11-19

nicholas.carlini.com

large-language-models

Are large language models worth it?

Large language models (LLMs) may be transformative but pose serious risks and are already causing harm, prompting the author to question whether they are "worth it." The author, Nicholas Carlini, work…

00:00

2025-03-17

nicholas.carlini.com

artificial-intelligence

Machines of Ruthless Efficiency

Nicholas Carlini argues that advanced AI systems pose significant societal risks precisely because they are designed for "ruthless efficiency." He outlines a spectrum of concerns, from current harms l…

00:00

2025-03-13

nicholas.carlini.com

artificial-intelligence

My Thoughts on the Future of "AI"

In a 2025 blog post, researcher Nicholas Carlini argues that the future of large language models is highly uncertain, with two plausible but opposing outcomes. He suggests that within three to five ye…

00:00

2025-03-11

nicholas.carlini.com

artificial-intelligence

What my privacy papers (don't) have to say about copyright and generative AI

In his article, Nicholas Carlini explains that his research on "memorization" in machine learning models, which demonstrates that models can sometimes output verbatim training data, is often cited in …

00:00

2025-03-05

nicholas.carlini.com

artificial-intelligence

Career Update: Google DeepMind -> Anthropic

Nicholas Carlini announced he is leaving Google DeepMind after seven years to join Anthropic for one year, citing disagreements with DeepMind leadership over its support for high-impact security and p…

00:00

2025-02-09

nicholas.carlini.com

artificial-intelligence

AI forecasting retrospective: you're (probably) over-confident

In a 2025 article, Nicholas Carlini asked readers to make 30 forecasts about AI in 2027 and 2030, requiring them to give 90% confidence intervals rather than point estimates. Analyzing the responses, …

00:00

2024-12-25

nicholas.carlini.com

large-language-models

Letting Language Models Write my Website

Nicholas Carlini describes a project where he uses a different large language model (LLM) each day for twelve days to completely rewrite his personal website homepage and bio. He prompts each model to…

00:00

2024-11-25

nicholas.carlini.com

artificial-intelligence

You should forecast the future of AI

Many people hold overly confident but vague predictions about AI's future, which are often proven wrong. To address this, author Nicholas Carlini presents a set of about 30 specific, refutable questio…

00:00

2023-09-22

nicholas.carlini.com

large-language-models

Playing chess with large language models

According to Nicholas Carlini's 2023 article, while computers have surpassed humans in chess for decades using specialized game-playing models, OpenAI's GPT-3.5-turbo-instruct—a language model designe…

00:00

2023-08-03

nicholas.carlini.com

large-language-models

Little Bobby |endoftext|

A command injection vulnerability in GPT-4 discovered by Nicholas Carlini, where the model's use of the `<|endoftext|>` token can cause it to abruptly end a response and begin generating unrelated con…

// co-occurs with top 8 entities

Anthropic (9), Claude (2), Fable 5 (2), Mythos 5 (2), Project Glasswing (2), OpenAI (2), GPT-4 (2), GPT-2 (2)